Python data analytics : data analysis and science using Pandas, matplotlib, and the Python programming language
Bookreader Item Preview
Share or Embed This Item
texts
Python data analytics : data analysis and science using Pandas, matplotlib, and the Python programming language
- Publication date
- 2015
- Topics
- Python (Computer program language), Data mining, Python (Langage de programmation), Exploration de données (Informatique), Programming & scripting languages: general, Public administration, COMPUTERS -- Programming Languages -- Python, computerwetenschappen, computer sciences, gegevensverwerking, data processing, computertechnieken, computer techniques, programmeertalen, programming languages, Information and Communication Technology (General), Informatie- en communicatietechnologie (algemeen)
- Publisher
- [Berkeley, CA] : Apress
- Collection
- inlibrary; printdisabled; internetarchivebooks
- Contributor
- Internet Archive
- Language
- English
1 online resource (xxi, 337 pages)
Python Data Analytics will help you tackle the world of data acquisition and analysis using the power of the Python language. At the heart of this book lies the coverage of pandas, an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. Author Fabio Nelli expertly shows the strength of the Python programming language when applied to processing, managing and retrieving information. Inside, you will see how intuitive and flexible it is to discover and communicate meaningful patterns of data using Python scripts, reporting systems, and data export. This book examines how to go about obtaining, processing, storing, managing and analyzing data using the Python programming language. You will use Python and other open source tools to wrangle data and tease out interesting and important trends in that data that will allow you to predict future patterns. Whether you are dealing with sales data, investment data (stocks, bonds, etc.), medical data, web page usage, or any other type of data set, Python can be used to interpret, analyze, and glean information from a pile of numbers and statistics. This book is an invaluable reference with its examples of storing and accessing data in a database; it walks you through the process of report generation; it provides three real world case studies or examples that you can take with you for your everyday analysis needs
Includes index
Online resource; title from PDF title page (SpringerLink, viewed September 1, 2015)
At a Glance -- Contents -- About the Author -- About the Technical Reviewer -- Acknowledgments -- Chapter 1: An Introduction to Data Analysis -- Data Analysis -- Knowledge Domains of the Data Analyst -- Computer Science -- Mathematics and Statistics -- Machine Learning and Artificial Intelligence -- Professional Fields of Application -- Understanding the Nature of the Data -- When the Data Become Information -- When the Information Becomes Knowledge -- Types of Data -- The Data Analysis Process -- Problem Definition -- Data Extraction -- Data Preparation -- Data Exploration/Visualization -- Predictive Modeling -- Model Validation -- Deployment -- Quantitative and Qualitative Data Analysis -- Open Data -- Python and Data Analysis -- Conclusions -- Chapter 2: Introduction to the Python's World -- Python-The Programming Language -- Python-The Interpreter -- Cython -- Jython -- PyPy -- Python 2 and Python 3 -- Installing Python -- Python Distributions -- Anaconda -- Enthought Canopy -- Python(x, y) -- Using Python -- Python Shell -- Run an Entire Program Code -- Implement the Code Using an IDE -- Interact with Python -- Writing Python Code -- Make Calculations -- Import New Libraries and Functions -- Data Structure -- Functional Programming (Only for Python 3.4) -- Indentation -- IPython -- IPython Shell -- IPython Qt-Console -- IPython Notebook -- The Jupyter Project -- PyPI-The Python Package Index -- The IDEs for Python -- IDLE (Integrated DeveLopment Environment) -- Spyder -- Eclipse (pyDev) -- Sublime -- Liclipse -- NinjaIDE -- Komodo IDE -- SciPy -- NumPy -- Pandas -- matplotlib -- Conclusions -- Chapter 3: The NumPy Library -- NumPy: A Little History -- The NumPy Installation -- Ndarray: The Heart of the Library -- Create an Array -- Types of Data -- The dtype Option -- Intrinsic Crea tion of an Array -- Basic Operations
Arit hmetic Operators -- The M atrix Product -- Increm ent and Decrement Operators -- Universal Functions (ufunc) -- Aggregat e Functions -- Indexing, Slicing, and Iterating -- Indexing -- Slicing -- Iterating an Array -- Conditions an d Boolean Arrays -- Shape Manipulation -- Array Manipulation -- Joining Arrays -- Splitting Arrays -- General Concepts -- Copies or Views of Objects -- Vectorization -- Broadcasting -- Structured Arrays -- Reading and Writing Array Data on Files -- Loading and Saving Data in Binary Files -- Reading File with T abular Data -- Conclusions -- Chapter 4: The pandas Library-An Introduction -- pandas: The Python Data Analysis Library -- Installation -- Installation from Anaconda -- Installation from PyPI -- Installation on Linux -- Installation from Source -- A Module Repository for Windows -- Test Your pandas Installation -- Getting Started with pandas -- Introduction to pandas Data Structures -- The Series -- Declaring a Series -- Selecting the Internal Elements -- Assigning Values to the Elements -- Defining Series from NumPy Arrays and Other Series -- Filtering Values -- Operations and Mathematical Functions -- Evaluating Values -- NaN Values -- Series as Dictionaries -- Operations between Series -- The DataFrame -- Defining a DataFrame -- Selecting Elements -- Assigning Values -- Membership of a Value -- Deleting a Column -- Filtering -- DataFrame from Nested dict -- Transposition of a DataFrame -- The Index Objects -- Methods on Index -- Index with Duplicate Labels -- Other Functionalities on Indexes -- Reindexing -- Dropping -- Arithmetic and Data Alignment -- Operations between Data Structures -- Flexible Arithmetic Methods -- Operations between DataFrame and Series -- Function Application and Mapping -- Functions by Element -- Functions by Row or Column -- Statistics Functions -- Sorting and Ranking
Correlation and Covariance -- "Not a Number" Data -- Assigning a NaN Value -- Filtering Out NaN Values -- Filling in NaN Occurrences -- Hierarchical Indexing and Leveling -- Reordering and Sorting Levels -- Summary Statistic by Level -- Conclusions -- Chapter 5: pandas: Reading and Writing Data -- I/O API Tools -- CSV and Textual Files -- Reading Data in CSV or Text Files -- Using RegExp for Parsing TXT Files -- Reading TXT Files into Parts or Partially -- Writing Data in CSV -- Reading and Writing HTML Files -- Writing Data in HTML -- Reading Data from an HTML File -- Reading Data from XML -- Reading and Writing Data on Microsoft Excel Files -- JSON Data -- The Format HDF5 -- Pickle-Python Object Serialization -- Serialize a Python Object with cPickle -- Pickling with pandas -- Interacting with Databases -- Loading and Writing Data with SQLite3 -- Loading and Writing Data with PostgreSQL -- Reading and Writing Data with a NoSQL Database: MongoDB -- Conclusions -- Chapter 6: pandas in Depth: Data Manipulation -- Data Preparation -- Merging -- Merging on Index -- Concatenating -- Combining -- Pivoting -- Pivoting with Hierarchical Indexing -- Pivoting from "Long" to "Wide" Format -- Removing -- Data Transformation -- Removing Duplicates -- Mapping -- Replacing Values via Mapping -- Adding Values via Mapping -- Rename the Indexes of the Axes -- Discretization and Binning -- Detecting and Filtering Outliers -- Permutation -- Random Sampling -- String Manipulation -- Built-in Methods for Manipulation of Strings -- Regular Expressions -- Data Aggregation -- GroupBy -- A Practical Example -- Hierarchical Grouping -- Group Iteration -- Chain of Transformations -- Functions on Groups -- Advanced Data Aggregation -- Conclusions -- Chapter 7: Data Visualization with matplotlib -- The matplotlib Library -- Installation -- IPython and IPython QtConsole
Matplotlib Architecture -- Backend Layer -- Artist Layer -- Scripting Layer (pyplot) -- pylab and pyplot -- pyplot -- A Simple Interactive Chart -- Set the Properties of the Plot -- matplotlib and NumPy -- Using the kwargs -- Working with Multiple Figures and Axes -- Adding Further Elements to the Chart -- Adding Text -- Adding a Grid -- Adding a Legend -- Saving Your Charts -- Saving the Code -- Converting Your Session as an HTML File -- Saving Your Chart Directly as an Image -- Handling Date Values -- Chart Typology -- Line Chart -- Line Charts with pandas -- Histogram -- Bar Chart -- Horizontal Bar Chart -- Multiserial Bar Chart -- Multiseries Bar Chart with pandas DataFrame -- Multiseries Stacked Bar Charts -- Stacked Bar Charts with pandas DataFrame -- Other Bar Chart Representations -- Pie Charts -- Pie Charts with pandas DataFrame -- Advanced Charts -- Contour Plot -- Polar Chart -- mplot3d -- 3D Surfaces -- Scatter Plot in 3D -- Bar Chart 3D -- Multi-Panel Plots -- Display Subplots within Other Subplots -- Grids of Subplots -- Conclusions -- Chapter 8: Machine Learning with scikit-learn -- The scikit-learn Library -- Machine Learning -- Supervised and Unsupervised Learning -- Training Set and Testing Set -- Supervised Learning with scikit-learn -- The Iris Flower Dataset -- The PCA Decomposition -- K-Nearest Neighbors Classifier -- Diabetes Dataset -- Linear Regression: The Least Square Regression -- Support Vector Machines (SVMs) -- Support Vector Classification (SVC) -- Nonlinear SVC -- Plotting Different SVM Classifiers Using the Iris Dataset -- Support Vector Regression (SVR) -- Conclusions -- Chapter 9: An Example-Meteorological Data -- A Hypothesis to Be Tested: The Influence of the Proximity of the Sea -- The System in the Study: The Adriatic Sea and the Po Valley -- Data Source -- Data Analysis on IPython Notebook -- The RoseWind
Calculating the Distribution of the Wind Speed Means -- Conclusions -- Chapter 10: Embedding the JavaScript D3 Library in IPython Notebook -- The Open Data Source for Demographics -- The JavaScript D3 Library -- Drawing a Clustered Bar Chart -- The Choropleth Maps -- The Choropleth Map of the US Population in 2014 -- Conclusions -- Chapter 11: Recognizing Handwritten Digits -- Handwriting Recognition -- Recognizing Handwritten Digits with scikit-learn -- The Digits Dataset -- Learning and Predicting -- Conclusions -- Appendix A: Writing Mathematical Expressions with LaTeX -- With matplotlib -- With IPython Notebook in a Markdown Cell -- With IPython Notebook in a Python 2 Cell -- Subscripts and Superscripts -- Fractions, Binomials, and Stacked Numbers -- Radicals -- Fonts -- Accents -- Appendix B: Open Data Sources -- Political and Government Data -- Health Data -- Social Data -- Miscellaneous and Public Data Sets -- Financial Data -- Climatic Data -- Sports Data -- Publications, Newspapers, and Books -- Musical Data -- Index
Python Data Analytics will help you tackle the world of data acquisition and analysis using the power of the Python language. At the heart of this book lies the coverage of pandas, an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. Author Fabio Nelli expertly shows the strength of the Python programming language when applied to processing, managing and retrieving information. Inside, you will see how intuitive and flexible it is to discover and communicate meaningful patterns of data using Python scripts, reporting systems, and data export. This book examines how to go about obtaining, processing, storing, managing and analyzing data using the Python programming language. You will use Python and other open source tools to wrangle data and tease out interesting and important trends in that data that will allow you to predict future patterns. Whether you are dealing with sales data, investment data (stocks, bonds, etc.), medical data, web page usage, or any other type of data set, Python can be used to interpret, analyze, and glean information from a pile of numbers and statistics. This book is an invaluable reference with its examples of storing and accessing data in a database; it walks you through the process of report generation; it provides three real world case studies or examples that you can take with you for your everyday analysis needs
Includes index
Online resource; title from PDF title page (SpringerLink, viewed September 1, 2015)
At a Glance -- Contents -- About the Author -- About the Technical Reviewer -- Acknowledgments -- Chapter 1: An Introduction to Data Analysis -- Data Analysis -- Knowledge Domains of the Data Analyst -- Computer Science -- Mathematics and Statistics -- Machine Learning and Artificial Intelligence -- Professional Fields of Application -- Understanding the Nature of the Data -- When the Data Become Information -- When the Information Becomes Knowledge -- Types of Data -- The Data Analysis Process -- Problem Definition -- Data Extraction -- Data Preparation -- Data Exploration/Visualization -- Predictive Modeling -- Model Validation -- Deployment -- Quantitative and Qualitative Data Analysis -- Open Data -- Python and Data Analysis -- Conclusions -- Chapter 2: Introduction to the Python's World -- Python-The Programming Language -- Python-The Interpreter -- Cython -- Jython -- PyPy -- Python 2 and Python 3 -- Installing Python -- Python Distributions -- Anaconda -- Enthought Canopy -- Python(x, y) -- Using Python -- Python Shell -- Run an Entire Program Code -- Implement the Code Using an IDE -- Interact with Python -- Writing Python Code -- Make Calculations -- Import New Libraries and Functions -- Data Structure -- Functional Programming (Only for Python 3.4) -- Indentation -- IPython -- IPython Shell -- IPython Qt-Console -- IPython Notebook -- The Jupyter Project -- PyPI-The Python Package Index -- The IDEs for Python -- IDLE (Integrated DeveLopment Environment) -- Spyder -- Eclipse (pyDev) -- Sublime -- Liclipse -- NinjaIDE -- Komodo IDE -- SciPy -- NumPy -- Pandas -- matplotlib -- Conclusions -- Chapter 3: The NumPy Library -- NumPy: A Little History -- The NumPy Installation -- Ndarray: The Heart of the Library -- Create an Array -- Types of Data -- The dtype Option -- Intrinsic Crea tion of an Array -- Basic Operations
Arit hmetic Operators -- The M atrix Product -- Increm ent and Decrement Operators -- Universal Functions (ufunc) -- Aggregat e Functions -- Indexing, Slicing, and Iterating -- Indexing -- Slicing -- Iterating an Array -- Conditions an d Boolean Arrays -- Shape Manipulation -- Array Manipulation -- Joining Arrays -- Splitting Arrays -- General Concepts -- Copies or Views of Objects -- Vectorization -- Broadcasting -- Structured Arrays -- Reading and Writing Array Data on Files -- Loading and Saving Data in Binary Files -- Reading File with T abular Data -- Conclusions -- Chapter 4: The pandas Library-An Introduction -- pandas: The Python Data Analysis Library -- Installation -- Installation from Anaconda -- Installation from PyPI -- Installation on Linux -- Installation from Source -- A Module Repository for Windows -- Test Your pandas Installation -- Getting Started with pandas -- Introduction to pandas Data Structures -- The Series -- Declaring a Series -- Selecting the Internal Elements -- Assigning Values to the Elements -- Defining Series from NumPy Arrays and Other Series -- Filtering Values -- Operations and Mathematical Functions -- Evaluating Values -- NaN Values -- Series as Dictionaries -- Operations between Series -- The DataFrame -- Defining a DataFrame -- Selecting Elements -- Assigning Values -- Membership of a Value -- Deleting a Column -- Filtering -- DataFrame from Nested dict -- Transposition of a DataFrame -- The Index Objects -- Methods on Index -- Index with Duplicate Labels -- Other Functionalities on Indexes -- Reindexing -- Dropping -- Arithmetic and Data Alignment -- Operations between Data Structures -- Flexible Arithmetic Methods -- Operations between DataFrame and Series -- Function Application and Mapping -- Functions by Element -- Functions by Row or Column -- Statistics Functions -- Sorting and Ranking
Correlation and Covariance -- "Not a Number" Data -- Assigning a NaN Value -- Filtering Out NaN Values -- Filling in NaN Occurrences -- Hierarchical Indexing and Leveling -- Reordering and Sorting Levels -- Summary Statistic by Level -- Conclusions -- Chapter 5: pandas: Reading and Writing Data -- I/O API Tools -- CSV and Textual Files -- Reading Data in CSV or Text Files -- Using RegExp for Parsing TXT Files -- Reading TXT Files into Parts or Partially -- Writing Data in CSV -- Reading and Writing HTML Files -- Writing Data in HTML -- Reading Data from an HTML File -- Reading Data from XML -- Reading and Writing Data on Microsoft Excel Files -- JSON Data -- The Format HDF5 -- Pickle-Python Object Serialization -- Serialize a Python Object with cPickle -- Pickling with pandas -- Interacting with Databases -- Loading and Writing Data with SQLite3 -- Loading and Writing Data with PostgreSQL -- Reading and Writing Data with a NoSQL Database: MongoDB -- Conclusions -- Chapter 6: pandas in Depth: Data Manipulation -- Data Preparation -- Merging -- Merging on Index -- Concatenating -- Combining -- Pivoting -- Pivoting with Hierarchical Indexing -- Pivoting from "Long" to "Wide" Format -- Removing -- Data Transformation -- Removing Duplicates -- Mapping -- Replacing Values via Mapping -- Adding Values via Mapping -- Rename the Indexes of the Axes -- Discretization and Binning -- Detecting and Filtering Outliers -- Permutation -- Random Sampling -- String Manipulation -- Built-in Methods for Manipulation of Strings -- Regular Expressions -- Data Aggregation -- GroupBy -- A Practical Example -- Hierarchical Grouping -- Group Iteration -- Chain of Transformations -- Functions on Groups -- Advanced Data Aggregation -- Conclusions -- Chapter 7: Data Visualization with matplotlib -- The matplotlib Library -- Installation -- IPython and IPython QtConsole
Matplotlib Architecture -- Backend Layer -- Artist Layer -- Scripting Layer (pyplot) -- pylab and pyplot -- pyplot -- A Simple Interactive Chart -- Set the Properties of the Plot -- matplotlib and NumPy -- Using the kwargs -- Working with Multiple Figures and Axes -- Adding Further Elements to the Chart -- Adding Text -- Adding a Grid -- Adding a Legend -- Saving Your Charts -- Saving the Code -- Converting Your Session as an HTML File -- Saving Your Chart Directly as an Image -- Handling Date Values -- Chart Typology -- Line Chart -- Line Charts with pandas -- Histogram -- Bar Chart -- Horizontal Bar Chart -- Multiserial Bar Chart -- Multiseries Bar Chart with pandas DataFrame -- Multiseries Stacked Bar Charts -- Stacked Bar Charts with pandas DataFrame -- Other Bar Chart Representations -- Pie Charts -- Pie Charts with pandas DataFrame -- Advanced Charts -- Contour Plot -- Polar Chart -- mplot3d -- 3D Surfaces -- Scatter Plot in 3D -- Bar Chart 3D -- Multi-Panel Plots -- Display Subplots within Other Subplots -- Grids of Subplots -- Conclusions -- Chapter 8: Machine Learning with scikit-learn -- The scikit-learn Library -- Machine Learning -- Supervised and Unsupervised Learning -- Training Set and Testing Set -- Supervised Learning with scikit-learn -- The Iris Flower Dataset -- The PCA Decomposition -- K-Nearest Neighbors Classifier -- Diabetes Dataset -- Linear Regression: The Least Square Regression -- Support Vector Machines (SVMs) -- Support Vector Classification (SVC) -- Nonlinear SVC -- Plotting Different SVM Classifiers Using the Iris Dataset -- Support Vector Regression (SVR) -- Conclusions -- Chapter 9: An Example-Meteorological Data -- A Hypothesis to Be Tested: The Influence of the Proximity of the Sea -- The System in the Study: The Adriatic Sea and the Po Valley -- Data Source -- Data Analysis on IPython Notebook -- The RoseWind
Calculating the Distribution of the Wind Speed Means -- Conclusions -- Chapter 10: Embedding the JavaScript D3 Library in IPython Notebook -- The Open Data Source for Demographics -- The JavaScript D3 Library -- Drawing a Clustered Bar Chart -- The Choropleth Maps -- The Choropleth Map of the US Population in 2014 -- Conclusions -- Chapter 11: Recognizing Handwritten Digits -- Handwriting Recognition -- Recognizing Handwritten Digits with scikit-learn -- The Digits Dataset -- Learning and Predicting -- Conclusions -- Appendix A: Writing Mathematical Expressions with LaTeX -- With matplotlib -- With IPython Notebook in a Markdown Cell -- With IPython Notebook in a Python 2 Cell -- Subscripts and Superscripts -- Fractions, Binomials, and Stacked Numbers -- Radicals -- Fonts -- Accents -- Appendix B: Open Data Sources -- Political and Government Data -- Health Data -- Social Data -- Miscellaneous and Public Data Sets -- Financial Data -- Climatic Data -- Sports Data -- Publications, Newspapers, and Books -- Musical Data -- Index
- Access-restricted-item
- true
- Addeddate
- 2022-11-07 14:01:46
- Autocrop_version
- 0.0.14_books-20220331-0.2
- Bookplateleaf
- 0006
- Boxid
- IA40756604
- Camera
- USB PTP Class Camera
- Collection_set
- printdisabled
- External-identifier
-
urn:lcp:pythondataanalyt0000nell:lcpdf:17bc71dc-73a6-40d8-aed2-45052a139d78
urn:lcp:pythondataanalyt0000nell:epub:bc766f5e-d467-4c1d-93f4-026e19a0357f
urn:oclc:record:919252330
- Foldoutcount
- 0
- Identifier
- pythondataanalyt0000nell
- Identifier-ark
- ark:/13960/s2xx4jwfjtc
- Invoice
- 1652
- Isbn
-
9781484209585
1484209583
1484209591
9781484209592
- Ocr
- tesseract 5.2.0-1-gc42a
- Ocr_detected_lang
- en
- Ocr_detected_lang_conf
- 1.0000
- Ocr_detected_script
- Latin
- Ocr_detected_script_conf
- 0.9476
- Ocr_module_version
- 0.0.18
- Ocr_parameters
- -l eng
- Old_pallet
- IA-NS-1200588
- Openlibrary_edition
- OL27508289M
- Openlibrary_work
- OL20305145W
- Page_number_confidence
- 91.58
- Pages
- 370
- Pdf_module_version
- 0.0.20
- Ppi
- 360
- Rcs_key
- 24143
- Republisher_date
- 20221107172414
- Republisher_operator
- associate-cecelia-atil@archive.org
- Republisher_time
- 178
- Scandate
- 20221102123824
- Scanner
- station47.cebu.archive.org
- Scanningcenter
- cebu
- Scribe3_search_catalog
- isbn
- Scribe3_search_id
- 9781484209592
- Tts_version
- 5.2-initial-114-g7c4a60b4
- Worldcat (source edition)
- 919479279
- Full catalog record
- MARCXML
comment
Reviews
There are no reviews yet. Be the first one to
write a review.
301 Previews
30 Favorites
DOWNLOAD OPTIONS
No suitable files to display here.
PDF access not available for this item.
Uploaded by station47.cebu on